Search for: All records

Creators/Authors contains: "Lee, Jun Ki"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period (an administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. This paper addresses the problem of training a robot to carry out temporal tasks of arbitrary complexity via evaluative human feedback that can be inaccurate. A key idea explored in our work is a kind of curriculum learning: training the robot to master simple tasks and then building up to more complex tasks. We show how a training procedure, using knowledge of the formal task representation, can decompose any task and train it efficiently in the size of its representation. We further provide a set of experiments that support the claim that non-expert human trainers can decompose tasks in a way that is consistent with our theoretical results, with more than half of participants successfully training all of our experimental missions. We compared our algorithm with existing approaches, and our experimental results suggest that our method outperforms alternatives, especially when feedback contains mistakes.
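As a rough illustration of the curriculum idea, consider the sketch below: a robot masters simple subtasks under noisy evaluative feedback before moving to harder ones. Everything here (the subtask names, the feedback error model, the mastery threshold) is a hypothetical assumption for illustration, not the paper's actual algorithm or task representation.

```python
# Hypothetical curriculum sketch: master simple subtasks before composing
# them into the full temporal task. Subtask names, the noisy-feedback
# error model, and the mastery threshold are illustrative assumptions.
import random

def noisy_feedback(correct: bool, error_rate: float = 0.1) -> bool:
    """Evaluative human feedback that is wrong with probability `error_rate`."""
    return correct if random.random() > error_rate else not correct

def train_subtask(subtask: str, trials: int = 200) -> float:
    """Estimate mastery of one subtask from repeated noisy feedback."""
    positives = sum(noisy_feedback(True) for _ in range(trials))
    return positives / trials

def curriculum_train(subtasks: list[str], threshold: float = 0.8) -> bool:
    """Train subtasks in order of complexity; stop if one is not mastered."""
    for subtask in subtasks:
        score = train_subtask(subtask)
        print(f"{subtask}: estimated mastery {score:.2f}")
        if score < threshold:
            return False  # would re-train this subtask before moving on
    return True

if __name__ == "__main__":
    # Subtasks ordered from simplest to most complex.
    curriculum_train(["reach-waypoint", "avoid-obstacle", "patrol-loop"])
```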
  2. Mutually beneficial behavior in repeated games can be enforced via the threat of punishment, as enshrined in game theory’s well-known “folk theorem.” There is a cost, however, to a player for generating these disincentives. In this work, we seek to minimize this cost by computing a “Stackelberg punishment,” in which the player selects a behavior that sufficiently punishes the other player while maximizing its own score under the assumption that the other player will adopt a best response. This idea generalizes the concept of a Stackelberg equilibrium. Known efficient algorithms for computing a Stackelberg equilibrium can be adapted to efficiently produce a Stackelberg punishment. We demonstrate an application of this idea in an experiment involving a virtual autonomous vehicle and human participants. We find that a self-driving car with a Stackelberg punishment policy discourages human drivers from bullying in a driving scenario requiring social negotiation. 
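The Stackelberg punishment described above can be sketched as a search over the leader's committed strategies: maximize the leader's own score subject to the follower's best-response value staying below a punishment threshold. The sketch below uses a brute-force grid search on a 2x2 game; the payoff matrices, threshold, and grid are illustrative assumptions, whereas the paper adapts known efficient Stackelberg-equilibrium algorithms.

```python
# Hypothetical sketch of a "Stackelberg punishment": the leader commits to
# a mixed strategy that maximizes its own payoff subject to the follower's
# best-response value staying below a punishment threshold. The 2x2 payoff
# matrices and threshold are illustrative assumptions, not the paper's game.

# Payoffs for leader action i (row) and follower action j (column).
LEADER = [[3, 0], [5, 1]]
FOLLOWER = [[3, 5], [0, 1]]
PUNISH_THRESHOLD = 1.0  # follower's best-response value must not exceed this

def stackelberg_punishment(grid: int = 100):
    """Brute-force search over leader mixed strategies on a grid."""
    best = None  # (leader_value, p) with p = Pr(leader plays action 0)
    for step in range(grid + 1):
        p = step / grid
        # The follower best-responds to the committed mixed strategy.
        follower_values = [p * FOLLOWER[0][j] + (1 - p) * FOLLOWER[1][j]
                           for j in (0, 1)]
        j_star = max((0, 1), key=lambda j: follower_values[j])
        if follower_values[j_star] > PUNISH_THRESHOLD:
            continue  # this commitment does not punish enough
        leader_value = p * LEADER[0][j_star] + (1 - p) * LEADER[1][j_star]
        if best is None or leader_value > best[0]:
            best = (leader_value, p)
    return best

if __name__ == "__main__":
    print(stackelberg_punishment())  # (1.0, 0.0) for these example payoffs
```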